24 research outputs found

    Find me if You Can: Aligning Users in Different Social Networks

    Online social networks allow users to share experiences with friends and relatives, make announcements, find news and jobs, and more. Several have user bases that number in the hundreds of millions or even billions. Many users belong to multiple social networks at the same time, possibly under different user names. Identifying a user from one social network on another social network gives information about the user's behavior on each platform, which in turn can help companies perform graph mining tasks such as community detection and link prediction. The process of identifying or aligning users in multiple networks is called network alignment. These similar (or same) users on different networks are called anchor nodes, and the edges between them are called anchor links. The network alignment problem aims at finding these anchor links. In this work we propose two supervised algorithms and one unsupervised algorithm using thresholds. All these algorithms use local structural graph features of users, and some of them use additional information about the users. We present the performance of our models in various settings using experiments based on Foursquare-Twitter and Facebook-Twitter data (User Identity Linkage Dataset). We show that our approaches perform well even when we use only the neighborhood of the users, and that accuracy improves further given additional information about a user, such as the username and the profile image. We further show that our best approaches perform better at the HR@1 task than the unsupervised and semi-supervised factoid embedding approaches considered earlier for these datasets.
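The abstract reports results on the HR@1 task without spelling out the metric. A minimal sketch follows, assuming HR@k is the usual hit rate at k: the fraction of source users whose true counterpart appears among the top k ranked candidates. The function name `hit_rate_at_k` and the toy user identifiers are hypothetical and are not taken from the paper.

```python
from typing import Dict, Hashable, Sequence


def hit_rate_at_k(
    ranked: Dict[Hashable, Sequence[Hashable]],
    truth: Dict[Hashable, Hashable],
    k: int = 1,
) -> float:
    """Fraction of users in `truth` whose true match is in their top-k candidates.

    ranked: source user -> candidate users on the other network, best first.
    truth:  source user -> the true counterpart (the anchor link).
    """
    if not truth:
        raise ValueError("truth must contain at least one anchor link")
    hits = 0
    for user, target in truth.items():
        top_k = list(ranked.get(user, []))[:k]
        if target in top_k:
            hits += 1
    return hits / len(truth)


# Toy example with made-up identifiers (assumption, not real data).
ranked = {"a": ["x", "y"], "b": ["z", "w"], "c": ["q"]}
truth = {"a": "x", "b": "w", "c": "q"}
hr1 = hit_rate_at_k(ranked, truth, k=1)  # "a" and "c" are hits, "b" is not
hr2 = hit_rate_at_k(ranked, truth, k=2)  # all three are hits
```

With k = 1 the metric only rewards an alignment method when its single best guess is the correct anchor link, which makes it the strictest setting of the hit-rate family.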

    Viral Marketing for Smart Cities: Influencers in Social Network Communities

    Social networks are used by cities primarily for announcing local-area events, but also for increasing the engagement of citizens in votes and elections. Given the current plethora of heterogeneous social networks, city administrators can use them to promote initiatives that are important to a modern smart city, as well as to discover future needs in order to manage resources more efficiently. Our focus in this paper is on how commercial and viral marketing techniques can be adapted to smart city systems to influence the behavior, opinions and choices of citizens, in order to improve their well-being and that of society as a whole, as well as to predict future trends and events.

    Graph Classification with Kernels, Embeddings and Convolutional Neural Networks

    In the graph classification problem, we are given a family of graphs and a set of categories, and we aim to classify all the graphs of the family into the given categories. Earlier approaches, such as graph kernels and graph embedding techniques, have focused on extracting certain features by processing the entire graph. However, real-world graphs are complex and noisy, and these traditional approaches are computationally intensive. With the introduction of the deep learning framework, there have been numerous attempts to create more efficient classification approaches. We modify a kernel graph convolutional neural network approach that extracts subgraphs (patches) from the graph using various community detection algorithms. These patches are provided as input to a graph kernel, and max pooling is applied. We use different community detection algorithms and a shortest path graph kernel and compare their efficiency and performance. In this paper we compare three methods, a graph kernel, an embedding technique and one that uses convolutional neural networks, on eight real-world datasets ranging from biological to social networks.
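To illustrate the kind of kernel the abstract mentions, here is a minimal sketch of a shortest path graph kernel for unlabeled, unweighted graphs. It assumes a simplified variant in which two shortest paths match exactly when they have the same length, and node labels are ignored; the paper's actual kernel and patch extraction may differ. The names `sp_length_histogram` and `shortest_path_kernel` are hypothetical.

```python
from collections import Counter, deque
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]  # adjacency sets, undirected


def sp_length_histogram(adj: Graph) -> Counter:
    """Count ordered node pairs by shortest-path length, using BFS from each node."""
    hist: Counter = Counter()
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adj[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        for target, d in dist.items():
            if target != source:
                hist[d] += 1
    return hist


def shortest_path_kernel(adj_a: Graph, adj_b: Graph) -> int:
    """Number of pairs of shortest paths (one per graph) that have equal length."""
    hist_a = sp_length_histogram(adj_a)
    hist_b = sp_length_histogram(adj_b)
    return sum(count * hist_b[d] for d, count in hist_a.items())


# Toy graphs (assumption): a 3-node path and a triangle.
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
k_path_tri = shortest_path_kernel(path3, triangle)
k_path_path = shortest_path_kernel(path3, path3)
```

In a patch-based pipeline of the kind described, a function like this would be applied to each extracted subgraph rather than to the whole graph, which keeps the all-pairs shortest path computation small.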

    Support Vector Machines for Image Spam Analysis

    Email is one of the most common forms of digital communication. Spam is unsolicited bulk email, while image spam consists of spam text embedded inside an image. Image spam is used as a means to evade text-based spam filters, and hence it poses a threat to email-based communication. In this research, we analyze image spam detection using support vector machines (SVMs), which we train on a wide variety of image features. We use a linear SVM to quantify the relative importance of the features under consideration. We also develop and analyze a realistic “challenge” dataset that illustrates the limitations of current image spam detection techniques.

    Community Detection via Neighborhood Overlap and Spanning Tree Computations

    Most of today's social networks are populated by several million active users, while the most popular of them accommodate well over one billion. Analyzing such huge complex networks has become particularly demanding in computational terms. A task of paramount importance for understanding the structure of social networks, as well as of many other real-world systems, is to identify communities, that is, sets of nodes that are more densely connected to each other than to other nodes of the network. In this paper we propose two algorithms for community detection in networks that employ the neighborhood overlap metric and appropriate spanning tree computations.
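The neighborhood overlap metric named in the abstract can be sketched as follows. This assumes the common textbook definition for an edge (u, v): the number of common neighbors divided by the number of nodes adjacent to at least one endpoint, excluding u and v themselves. The spanning tree step of the proposed algorithms is not described in the abstract and is not reproduced here; the helper names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def build_adjacency(edges: Iterable[Tuple[Hashable, Hashable]]) -> Graph:
    """Build undirected adjacency sets from an edge list."""
    adj: Graph = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def neighborhood_overlap(adj: Graph, u: Hashable, v: Hashable) -> float:
    """Shared neighbors of u and v over all their neighbors, excluding u and v."""
    neighbors_u = adj[u] - {v}
    neighbors_v = adj[v] - {u}
    union = neighbors_u | neighbors_v
    if not union:
        return 0.0  # neither endpoint has any other neighbor
    return len(neighbors_u & neighbors_v) / len(union)


# Toy graph (assumption): a triangle a-b-c with a tail b-d-e.
adj = build_adjacency([("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("d", "e")])
overlap_ab = neighborhood_overlap(adj, "a", "b")  # shares c; union is {c, d}
overlap_bd = neighborhood_overlap(adj, "b", "d")  # no shared neighbor
```

Edges inside a tightly knit group tend to score high, while edges that bridge groups score near zero, which is why the metric is a natural edge weight for community detection.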

    CoDiS: Community Detection via Distributed Seed Set Expansion on Graph Streams

    Community detection has been (and remains) a very important topic in several fields. From marketing and social networking to biological studies, community detection plays a key role in advancing research. Research on this topic originally looked at classifying nodes into discrete (non-overlapping) communities but eventually moved on to placing nodes in multiple (overlapping) communities. Unfortunately, community detection has always been a time-inefficient process, and datasets are too large to process realistically using traditional methods. Because of this, recent methods have turned to parallelism and graph stream models, where the edge list is accessed one edge at a time. However, all these methods, while offering a significant decrease in processing time, still have several shortcomings. We propose a new parallel algorithm called community detection with seed sets (CoDiS), which solves the overlapping community detection problem in graph streams. Initially, some nodes (seed sets) have known community structures, and the aim is to expand these communities by processing one edge at a time. The innovation of our approach is that it splits communities among the parallel computation workers so that each worker is only updating a subset of all the communities. By doing so, we decrease the edge processing throughput and decrease the amount of time each worker spends on each edge. Crucially, we remove the need for every worker to have access to every community. Experimental results show that we are able to gain a significant improvement in running time with no loss of accuracy.
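The idea of splitting communities among workers can be sketched as below. This is not the CoDiS algorithm itself: the round-robin sharding, the expansion rule (a node joins a community once `min_links` streamed edges connect it to current members), and every function name are assumptions made purely for illustration. Workers are simulated sequentially; in a real deployment each shard would run in its own process.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]
Shard = Dict[str, Set[Hashable]]  # community id -> member nodes


def partition_communities(seeds: Dict[str, Set[Hashable]], num_workers: int) -> List[Shard]:
    """Assign each seed community to exactly one worker (round-robin, an assumption)."""
    shards: List[Shard] = [dict() for _ in range(num_workers)]
    for index, community_id in enumerate(sorted(seeds)):
        shards[index % num_workers][community_id] = set(seeds[community_id])
    return shards


def process_edge(shard: Shard, link_counts: Dict, edge: Edge, min_links: int = 2) -> None:
    """Update only this worker's communities for one streamed edge (hypothetical rule)."""
    u, v = edge
    for community_id, members in shard.items():
        for inside, outside in ((u, v), (v, u)):
            if inside in members and outside not in members:
                link_counts[(community_id, outside)] += 1
                if link_counts[(community_id, outside)] >= min_links:
                    members.add(outside)


def run_stream(seeds: Dict[str, Set[Hashable]], stream: Iterable[Edge], num_workers: int) -> Shard:
    """Simulate all workers over the same edge stream and merge their results."""
    edges = list(stream)
    result: Shard = {}
    for shard in partition_communities(seeds, num_workers):
        link_counts: Dict = defaultdict(int)  # worker-local state only
        for edge in edges:
            process_edge(shard, link_counts, edge)
        result.update(shard)
    return result


# Toy seed sets and stream (assumption, not data from the paper).
seeds = {"c0": {1, 2}, "c1": {5, 6}}
stream = [(1, 3), (2, 3), (5, 7), (3, 4), (6, 7)]
communities = run_stream(seeds, stream, num_workers=2)
```

The point of the sketch is the data layout: each worker holds only its own shard and its own counters, so no worker needs access to every community.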